Introduces QuantizationConfig for fine-grained quantization control #21896
base: master
Conversation
Summary of Changes: Hello @JyotinderSingh, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request significantly enhances the quantization capabilities within Keras by introducing a flexible `QuantizationConfig`.
Code Review
This pull request introduces a QuantizationConfig to provide a more flexible and customizable quantization API. This is a significant improvement, allowing users to specify their own quantizers for weights and activations, and enabling features like weight-only quantization. The changes are well-implemented across various layers including Dense, EinsumDense, Embedding, and ReversibleEmbedding, as well as the model-level quantize method. The new QuantizationConfig class is well-designed with serialization support, and the accompanying tests are comprehensive. I have a couple of suggestions for minor code improvements to reduce redundancy and enhance clarity.
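To make the review's description concrete, here is a minimal, self-contained sketch of what a config object with separate weight and activation quantizers could look like. This is an illustrative stand-in, not the actual Keras API from the PR: the field names, the toy `int8_quantizer`, and the `weight_only` property are all assumptions for demonstration.

```python
# Hypothetical sketch of a fine-grained quantization config.
# All names and fields here are illustrative, not the real Keras API.
from dataclasses import dataclass
from typing import Callable, Optional


def int8_quantizer(value: float, scale: float = 127.0) -> int:
    """Toy symmetric int8 quantizer: scales and clamps to [-127, 127]."""
    q = round(value * scale)
    return max(-127, min(127, q))


@dataclass
class QuantizationConfig:
    """Illustrative config with separate weight/activation quantizers.

    Leaving activation_quantizer as None models weight-only quantization,
    one of the features the review highlights.
    """
    mode: str = "int8"
    weight_quantizer: Callable[[float], int] = int8_quantizer
    activation_quantizer: Optional[Callable[[float], int]] = None

    @property
    def weight_only(self) -> bool:
        return self.activation_quantizer is None

    def get_config(self) -> dict:
        # Serialization support, mirroring the usual Keras get_config style.
        return {"mode": self.mode, "weight_only": self.weight_only}


config = QuantizationConfig(mode="int8")  # no activation quantizer given
print(config.weight_only)            # True
print(config.weight_quantizer(0.5))  # 64
```

Keeping the quantizers as plain callables is what allows users to "specify their own quantizers for weights and activations" as the review describes; any function with the same signature can be dropped in.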
Codecov Report: coverage diff and impacted files:

```diff
@@           Coverage Diff            @@
##           master   #21896    +/-  ##
==========================================
+ Coverage   76.30%   76.33%   +0.03%
==========================================
  Files         580      581       +1
  Lines       60029    60184     +155
  Branches     9432     9460      +28
==========================================
+ Hits        45803    45942     +139
- Misses      11750    11759       +9
- Partials     2476     2483       +7
```
Force-pushed from 2ae1e37 to a3668d5
/gemini review
Code Review
This pull request introduces `QuantizationConfig` to provide more structured, fine-grained control over quantization settings. The changes are well implemented across the affected layers, and the new configuration class is well designed. I've found a couple of minor issues related to an unused parameter and an outdated docstring that should be addressed.
keras/src/models/model.py (Outdated)

```
mode: The mode of the quantization. Only 'int8' is supported at this
    time.
```
The docstring for the mode argument is outdated. It should be updated to reflect all supported quantization modes, such as 'int8', 'int4', 'float8', and 'gptq'. Currently, it only mentions 'int8'. It should also clarify that mode is optional if config is provided.
```diff
- mode: The mode of the quantization. Only 'int8' is supported at this
-     time.
+ mode: The mode for quantization, e.g., `'int8'`, `'int4'`, `'float8'`,
+     or `'gptq'`. This is optional if a `config` object is provided.
```
```python
    return "float8"

# ...

def validate_and_resolve_config(mode, config, name=None):
```
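The signature above suggests a helper that reconciles a bare `mode` string with an explicit `config` object. The sketch below is a guess at that resolution logic, not the PR's actual implementation: the `SUPPORTED_MODES` tuple, the minimal `QuantizationConfig` stand-in, and the error messages are all assumptions; only the function signature comes from the diff.

```python
# Hedged sketch of mode/config resolution. The signature matches the diff
# above; the body and helper class are illustrative assumptions.
SUPPORTED_MODES = ("int8", "int4", "float8", "gptq")


class QuantizationConfig:
    # Minimal stand-in for the config class introduced by the PR.
    def __init__(self, mode):
        if mode not in SUPPORTED_MODES:
            raise ValueError(f"Unsupported quantization mode: {mode!r}")
        self.mode = mode


def validate_and_resolve_config(mode, config, name=None):
    """Resolve (mode, config) into a single QuantizationConfig.

    An explicit config wins, a bare mode is wrapped in a default
    config, and conflicting or missing inputs raise a ValueError.
    """
    label = f" for layer {name!r}" if name else ""
    if config is not None:
        if mode is not None and mode != config.mode:
            raise ValueError(
                f"Received conflicting `mode` ({mode!r}) and "
                f"`config.mode` ({config.mode!r}){label}."
            )
        return config
    if mode is None:
        raise ValueError(
            f"Either `mode` or `config` must be provided{label}."
        )
    return QuantizationConfig(mode)


resolved = validate_and_resolve_config("int8", None)
print(resolved.mode)  # int8
```

Funneling both call styles through one helper keeps the layer-level `quantize` methods free of duplicated validation, which matches the reviewer's earlier note about reducing redundancy.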
Force-pushed from 3a31239 to 6917701